Maximal rank likelihood as an optimization function for speech recognition
Authors
Abstract
Research has shown that rank statistics derived from context-dependent state likelihoods can provide robust speech recognition. In previous work, empirical distributions were used to characterize the rank statistics. We present parametric models of the state rank and the rank likelihood and, based on them, a new objective function, Maximal Rank Likelihood (MRL), for estimating parameters in an HMM-based speech recognition system. The objective function maximizes the average logarithm of the rank likelihood of the training/adaptation data. It is a discriminative estimation process and hence brings the training criterion close to the decoding criterion. Three applications of MRL are discussed. The first is a linear discriminative projection, which optimizes the objective function over all training data and projects feature vectors into a discriminative space of reduced dimension. The second and third applications are a feature-space transformation and a model-space transformation, respectively, for adaptation. The transformations are optimized to maximize the rank likelihood of the adaptation data. Experimental results show that the MRL adaptation algorithms outperform MLLR adaptation.
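The core quantities in the abstract — the rank of the correct context-dependent state among all state likelihoods at each frame, and the average log rank likelihood used as the MRL objective — can be sketched numerically. The paper's actual parametric rank model is not given in this abstract, so a truncated geometric distribution over ranks is assumed here purely for illustration; the function names `state_ranks` and `avg_log_rank_likelihood` are hypothetical helpers, not the authors' implementation.

```python
import numpy as np

def state_ranks(log_likelihoods, correct_states):
    """For each frame, return the rank (1 = best) of the correct
    state's score among all context-dependent state scores.

    log_likelihoods: (n_frames, n_states) array of per-frame scores.
    correct_states:  sequence of the aligned state index per frame.
    """
    ranks = []
    for frame_ll, s in zip(log_likelihoods, correct_states):
        # rank = 1 + number of states scoring strictly higher
        ranks.append(1 + int(np.sum(frame_ll > frame_ll[s])))
    return np.array(ranks)

def avg_log_rank_likelihood(ranks, n_states, alpha=0.9):
    """MRL-style objective: average log probability of the observed
    ranks under an assumed (illustrative) truncated-geometric rank
    model P(r) proportional to alpha**(r-1), r = 1..n_states."""
    r = np.arange(1, n_states + 1)
    p = alpha ** (r - 1)
    p /= p.sum()                      # normalize over the n_states ranks
    return float(np.mean(np.log(p[ranks - 1])))

# Toy usage: 2 frames, 3 context-dependent states.
ll = np.array([[0.5, 2.0, 1.0],
               [3.0, 0.1, 0.2]])
ranks = state_ranks(ll, [1, 0])       # correct state is best in both frames
objective = avg_log_rank_likelihood(ranks, n_states=3)
```

Under this sketch, pushing the correct state toward rank 1 on more frames increases the objective, which is the sense in which the training criterion tracks the decoding criterion.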
Similar references
Minimum rank error training for language modeling
Discriminative training techniques have been successfully developed for many pattern recognition applications. In speech recognition, discriminative training aims to minimize the metric of word error rate. However, in an information retrieval system, the best performance should be achieved by maximizing the average precision. In this paper, we construct the discriminative n-gram language model ...
Unsupervised Submodular Rank Aggregation on Score-based Permutations
Unsupervised rank aggregation on score-based permutations, which is widely used in many applications, has not been deeply explored yet. This work studies the use of submodular optimization for rank aggregation on score-based permutations in an unsupervised way. Specifically, we propose an unsupervised approach based on the Lovasz Bregman divergence for setting up linear structured convex and ne...
Feature dimension reduction using reduced-rank maximum likelihood estimation for hidden Markov models
This paper presents a new method of feature dimension reduction in hidden Markov modeling (HMM) for speech recognition. The key idea is to apply reduced rank maximum likelihood estimation in the M-step of the usual Baum-Welch algorithm for estimating HMM parameters such that the estimates of the Gaussian distribution parameters are restricted in a sub-space of reduced dimensionality. There are ...
Model order estimation using Bayesian NMF for discovering phone patterns in spoken utterances
In earlier work, we have shown that vocabulary discovery from spoken utterances and subsequent recognition of the acquired vocabulary can be achieved through Non-negative Matrix Factorization (NMF). An open issue for this task is to determine automatically how many different word representations should be included in the model. In this paper, Bayesian NMF is applied to estimate the model order....
Gaussian mixture optimization for HMM based on efficient cross-validation
A Gaussian mixture optimization method is explored using cross-validation likelihood as an objective function instead of the conventional training-set likelihood. The optimization reduces the number of mixture components by selecting and merging pairs of Gaussians step by step based on the objective function, so as to remove redundant components and improve the generality of the mod...
Journal title:
Volume  Issue
Pages -
Publication year 2000